Identifying native-like protein structures using physics-based potentials

Authors

  • Brian N. Dominy
  • Charles L. Brooks
Abstract

As the field of structural genomics matures, new methods will be required that can accurately and rapidly distinguish reliable structure predictions from those that are more dubious. We present a method based on the CHARMM gas phase implicit hydrogen force field in conjunction with a generalized Born implicit solvation term that allows one to make such discrimination. We begin by analyzing pairs of threaded structures from the EMBL database, and find that it is possible to identify the misfolded structures with over 90% accuracy. Further, we find that misfolded states are generally favored by the solvation term due to the mispairing of favorable intramolecular ionic contacts. We also examine 29 sets of 29 misfolded globin sequences from Levitt's "Decoys 'R' Us" database generated using a sequence homology-based method. Again, we find that discrimination is possible with approximately 90% accuracy. Also, even in these less distorted structures, mispairing of ionic contacts results in a more favorable solvation energy for misfolded states. This is also found to be the case for collapsed, partially folded conformations of CspA and protein G taken from folding free energy calculations. We also find that the inclusion of the generalized Born solvation term, in postprocess energy evaluation, improves the correlation between structural similarity and energy in the globin database. This significantly improves the reliability of the hypothesis that more energetically favorable structures are also more similar to the native conformation. Additionally, we examine seven extensive collections of misfolded structures created by Park and Levitt using a four-state reduced model also contained in the "Decoys 'R' Us" database. 
Results from these large databases confirm those obtained in the EMBL and misfolded globin databases concerning predictive accuracy, the energetic advantage of misfolded proteins regarding the solvation component, and the improved correlation between energy and structural similarity due to implicit solvation. Z-scores computed for these databases are improved by including the generalized Born implicit solvation term, and are found to be comparable to trained and knowledge-based scoring functions. Finally, we briefly explore the dynamic behavior of a misfolded protein relative to properly folded conformations. We demonstrate that the misfolded conformation diverges quickly from its initial structure while the properly folded states remain stable. Proteins in this study are shown to be more stable than their misfolded counterparts and readily identified based on energetic as well as dynamic criteria. In summary, we demonstrate the utility of physics-based force fields in identifying native-like conformations in a variety of preconstructed structural databases. The details of this discrimination are shown to be dependent on the construction of the structural database.
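The two figures of merit mentioned in the abstract, pairwise discrimination accuracy and the Z-score of the native structure against a decoy set, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample energies are hypothetical, and the Z-score convention assumed here (native energy minus mean decoy energy, divided by the population standard deviation of the decoy energies, so that more negative is better) is one common choice that the abstract does not itself specify.

```python
import statistics


def native_z_score(native_energy, decoy_energies):
    """Z-score of the native energy relative to a set of decoy energies.

    Assumed convention: (E_native - mean(E_decoys)) / std(E_decoys),
    using the population standard deviation. A more negative value
    means the native structure is better separated from the decoys.
    """
    mean = statistics.fmean(decoy_energies)
    std = statistics.pstdev(decoy_energies)
    if std == 0:
        raise ValueError("decoy energies have zero spread; Z-score is undefined")
    return (native_energy - mean) / std


def pairwise_accuracy(pairs):
    """Fraction of (native, misfolded) energy pairs ranked correctly.

    A pair counts as correct when the native structure has the
    strictly lower energy.
    """
    if not pairs:
        raise ValueError("need at least one (native, misfolded) pair")
    correct = sum(1 for native, misfolded in pairs if native < misfolded)
    return correct / len(pairs)


if __name__ == "__main__":
    # Hypothetical energies, for illustration only.
    decoys = [-95.0, -90.0, -88.0, -101.0, -93.0]
    print("Z-score:", native_z_score(-120.0, decoys))
    print("Accuracy:", pairwise_accuracy([(-120.0, -95.0), (-80.0, -85.0)]))
```

In a study of this kind the energies would come from a force-field evaluation of each structure (here, a molecular mechanics term plus an implicit solvation term); the sketch only covers the scoring step applied afterwards.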

Related articles

All-Atom Four-Body Knowledge-Based Statistical Potentials to Distinguish Native Protein Structures from Nonnative Folds

Recent advances in understanding protein folding have benefitted from coarse-grained representations of protein structures. Empirical energy functions derived from these techniques occasionally succeed in distinguishing native structures from their corresponding ensembles of nonnative folds or decoys which display varying degrees of structural dissimilarity to the native proteins. Here we utili...

Atomic Four-Body Statistical Potential for Macromolecular Structure Analysis

Over recent years, exponential growth of the Protein Data Bank (PDB) has facilitated selection of larger, nonredundant subsets of experimentally solved macromolecular structures at higher resolutions, which in turn have provided the data used in developing more effective knowledge-based statistical potentials for improved structure prediction. In contrast to physics-based energy functions, stat...

Anisotropic coarse-grained statistical potentials improve the ability to identify native-like protein structures

We present a new method to extract distance and orientation dependent potentials between amino acid side chains using a database of protein structures and the standard Boltzmann device. The importance of orientation dependent interactions is first established by computing orientational order parameters for proteins with α-helical and β-sheet architecture. Extraction of the anisotropic interacti...

In quest of an empirical potential for protein structure prediction.

Key to successful protein structure prediction is a potential that recognizes the native state from misfolded structures. Recent advances in empirical potentials based on known protein structures include improved reference states for assessing random interactions, sidechain-orientation-dependent pair potentials, potentials for describing secondary or supersecondary structural preferences and, m...

Evaluation of atomic level mean force potentials via inverse folding and inverse refinement of protein structures: atomic burial position and pairwise non-bonded interactions.

Two atomic level knowledge-based mean force interaction potentials (KBPs), a centrosymmetric burial position term and a long-range pairwise term, were developed. These were tested by comparing multiple configurations of three structurally unrelated proteins and were found successfully to (i) discriminate native state proteins from grossly misfolded structures in inverse folding tests, (ii) rank...

On the transferability of folding and threading potentials and sequence-independent filters for protein folding simulations

Significant progress has recently been made in de novo protein structure prediction. The Rosetta method by Baker and colleagues, which is based on the idea of assembling putative models from a library of k-mer fragments derived from known three-dimensional protein structures, proved to be particularly successful. Critical components of the Rosetta approach are various sequence-dependent as well...

Journal:
  • Journal of Computational Chemistry

Volume 23, Issue 1

Pages  -

Publication date: 2002